Generative AI in Croatian Education

A Media Frame Analysis (2023–2025)

Author

Lux

Published

December 1, 2025

Executive Summary

This report analyzes Croatian web media coverage of generative AI in education between 2023 and 2025. Using computational frame analysis and natural language processing, we examine how media narratives evolved from early threat-focused coverage toward pragmatic integration.

Key Findings
  • Coverage volume: Substantial media attention with identifiable peaks around key events
  • Dominant frames: OPPORTUNITY and REGULATION frames predominate over THREAT
  • Narrative evolution: Measurable shift from threat-heavy toward integration-focused coverage
  • Source variation: Significant differences in framing between outlet types

1 Introduction

1.1 Background

The release of ChatGPT in November 2022 triggered a global conversation about artificial intelligence in education. Croatia, like many countries, witnessed intense media debate about the implications of generative AI for students, teachers, and educational institutions.

1.2 Research Questions

This analysis addresses four core questions:

  1. Volume & Timing: How much coverage exists, and when did it peak?
  2. Framing: Which interpretive frames dominate, and how do they shift over time?
  3. Actors: Who is represented in coverage, and who is given voice?
  4. Sources: Do different media types frame AI in education differently?

1.3 Theoretical Framework

Our analysis draws on:

  • Framing Theory (Entman, 1993): Media frames as patterns of selection and emphasis
  • Moral Panic Theory (Cohen, 1972): Technology adoption often follows panic cycles
  • Diffusion of Innovations (Rogers, 1962): Media coverage mirrors adoption stages

2 Data and Methods

2.1 Data Source

Show code
# Load the pre-processed data
raw_data <- read.xlsx("./dta.xlsx")

cat("Dataset loaded successfully\n")
Dataset loaded successfully
Show code
cat("Total articles:", format(nrow(raw_data), big.mark = ","), "\n")
Total articles: 4,424 
Show code
cat("Columns:", ncol(raw_data), "\n")
Columns: 48 

2.2 Data Processing

Show code
# Validate and clean data
validated_data <- raw_data %>%
  filter(!is.na(FULL_TEXT) & !is.na(TITLE))

# Parse dates
clean_data <- validated_data %>%
  mutate(
    DATE = as.Date(DATE),
    year = year(DATE),
    month = month(DATE),
    year_month = floor_date(DATE, "month"),
    week = floor_date(DATE, "week"),
    quarter = quarter(DATE),
    word_count = str_count(FULL_TEXT, "\\S+"),
    article_id = row_number()
  ) %>%
  filter(!is.na(DATE)) %>%
  distinct(TITLE, DATE, .keep_all = TRUE) %>%
  arrange(DATE)

# Day of week
clean_data$day_of_week <- wday(clean_data$DATE, label = TRUE, abbr = FALSE)

cat("Articles after cleaning:", format(nrow(clean_data), big.mark = ","), "\n")
Articles after cleaning: 3,873 
Show code
cat("Date range:", as.character(min(clean_data$DATE)), "to", as.character(max(clean_data$DATE)), "\n")
Date range: 2021-01-01 to 2024-07-04 

2.3 Frame Dictionaries

We developed Croatian-language dictionaries for eight interpretive frames:

Show code
frame_dictionaries <- list(
  THREAT = c(
    "prijetnja", "opasnost", "opasno", "rizik", "rizično",
    "varanje", "varati", "prevara", "plagijat", "plagiranje",
    "prepisivanje", "zabrana", "zabraniti", "zabranjeno",
    "uništiti", "uništava", "smrt", "kraj", "propast",
    "kriza", "alarm", "upozorenje", "šteta", "štetno",
    "strah", "bojati", "panika"
  ),
  
  OPPORTUNITY = c(
    "alat", "sredstvo", "pomoć", "pomoćnik", "asistent",
    "prilika", "mogućnost", "potencijal", "prednost", "korist",
    "poboljšati", "poboljšanje", "unaprijediti", "napredak",
    "učinkovit", "učinkovitost", "efikasan", "produktivnost",
    "budućnost", "inovacija", "inovativan", "revolucija",
    "moderan", "modernizacija", "transformacija",
    "uspjeh", "uspješno", "izvrsno"
  ),
  
  REGULATION = c(
    "pravilnik", "pravilo", "propisi", "regulativa",
    "smjernice", "upute", "protokol",
    "zakon", "zakonski", "pravni",
    "ministarstvo", "ministar", "vlada",
    "dopušteno", "dopuštenje", "dozvola",
    "primjena", "provedba", "implementacija",
    "odluka", "mjera"
  ),
  
  DISRUPTION = c(
    "promjena", "promijeniti", "transformacija", "preobrazba",
    "prilagodba", "prilagoditi", "adaptacija",
    "neizbježno", "nezaustavljivo",
    "revolucija", "prekretnica", "nova era", "novi način",
    "evolucija", "disrupcija"
  ),
  
  REPLACEMENT = c(
    "zamjena", "zamijeniti", "zamjenjuje", "istisnuti",
    "gubitak posla", "nepotreban", "suvišan", "zastario",
    "automatizacija", "automatizirano",
    "nadmašiti", "bolji od čovjeka"
  ),
  
  QUALITY = c(
    "halucinacija", "halucinacije", "greška", "greške",
    "netočno", "netočnost", "pogrešno",
    "pouzdanost", "pouzdan", "nepouzdan",
    "provjera", "provjeriti", "verifikacija",
    "kvaliteta", "kritički", "kritičko mišljenje"
  ),
  
  EQUITY = c(
    "nejednakost", "nejednako", "jaz", "razlika",
    "pristup", "pristupačnost", "dostupnost",
    "digitalni jaz", "siromašan", "socioekonomski",
    "pravednost", "pravedno", "nepravedno"
  ),
  
  COMPETENCE = c(
    "vještine", "vještina", "kompetencije",
    "sposobnost", "pismenost", "digitalna pismenost",
    "kritičko mišljenje", "analitičko mišljenje",
    "učiti", "obrazovanje", "edukacija", "usavršavanje"
  )
)

# Actor dictionaries
actor_dictionaries <- list(
  STUDENTS = c("student", "studenti", "učenik", "učenici", "đak", "maturant", "brucoš"),
  TEACHERS = c("učitelj", "učitelji", "nastavnik", "profesor", "profesori", "predavač", "mentor"),
  ADMINISTRATORS = c("ravnatelj", "dekan", "rektor", "prorektor", "voditelj"),
  INSTITUTIONS = c("škola", "škole", "fakultet", "sveučilište", "ministarstvo", "carnet"),
  TECH_COMPANIES = c("openai", "microsoft", "google", "chatgpt", "gpt", "gemini", "copilot"),
  EXPERTS = c("stručnjak", "ekspert", "znanstvenik", "istraživač", "analitičar"),
  POLICY_MAKERS = c("ministar", "zastupnik", "premijer", "vlada", "sabor")
)

# Sentiment dictionaries
sentiment_positive <- c(
  "dobar", "dobro", "odličan", "sjajan", "izvrstan", "fantastičan",
  "pozitivan", "uspješan", "uspjeh", "napredak", "poboljšanje",
  "zadovoljan", "optimizam", "nada", "kvalitetan", "koristan"
)

sentiment_negative <- c(
  "loš", "loše", "negativan", "grozan", "užasan", "katastrofa",
  "problem", "neuspjeh", "propast", "pogoršanje",
  "nezadovoljan", "razočaran", "pesimizam", "strah",
  "nekvalitetan", "beskoristan"
)

cat("Frame dictionaries created:", length(frame_dictionaries), "frames\n")
Frame dictionaries created: 8 frames
Show code
cat("Actor dictionaries created:", length(actor_dictionaries), "actor types\n")
Actor dictionaries created: 7 actor types

2.4 Frame Detection

Show code
# Count dictionary hits in a text. The pattern anchors only the word start
# ("\\b") and leaves the ending open, so each stem also matches inflected
# Croatian forms (e.g. "zabrana" also matches "zabranama").
detect_frames <- function(text, dictionaries) {
  if (is.na(text)) return(setNames(rep(0, length(dictionaries)), names(dictionaries)))
  text_lower <- str_to_lower(text)
  sapply(names(dictionaries), function(frame_name) {
    pattern <- paste0("\\b(", paste(dictionaries[[frame_name]], collapse = "|"), ")")
    sum(str_count(text_lower, pattern))
  })
}

# Binary variant: TRUE if the text contains at least one hit for the dictionary
detect_frame_presence <- function(text, dictionaries) {
  if (is.na(text)) return(setNames(rep(FALSE, length(dictionaries)), names(dictionaries)))
  text_lower <- str_to_lower(text)
  sapply(names(dictionaries), function(frame_name) {
    pattern <- paste0("\\b(", paste(dictionaries[[frame_name]], collapse = "|"), ")")
    str_detect(text_lower, pattern)
  })
}

# Apply frame analysis (with progress indicator)
message("Applying frame analysis...")

frame_results <- lapply(seq_len(nrow(clean_data)), function(i) {
  combined_text <- paste(clean_data$TITLE[i], clean_data$FULL_TEXT[i], sep = " ")
  
  frame_counts <- detect_frames(combined_text, frame_dictionaries)
  frame_presence <- detect_frame_presence(combined_text, frame_dictionaries)
  actor_counts <- detect_frames(combined_text, actor_dictionaries)
  actor_presence <- detect_frame_presence(combined_text, actor_dictionaries)
  
  # Sentiment
  text_lower <- str_to_lower(combined_text)
  pos_count <- sum(str_count(text_lower, paste0("\\b(", paste(sentiment_positive, collapse = "|"), ")")))
  neg_count <- sum(str_count(text_lower, paste0("\\b(", paste(sentiment_negative, collapse = "|"), ")")))
  
  c(
    setNames(frame_counts, paste0("frame_", names(frame_counts), "_count")),
    setNames(frame_presence, paste0("frame_", names(frame_presence), "_present")),
    setNames(actor_counts, paste0("actor_", names(actor_counts), "_count")),
    setNames(actor_presence, paste0("actor_", names(actor_presence), "_present")),
    sentiment_POSITIVE_count = pos_count,
    sentiment_NEGATIVE_count = neg_count
  )
})

frame_df <- bind_rows(lapply(frame_results, as.data.frame.list))
clean_data <- bind_cols(clean_data, frame_df)

# Calculate derived metrics
clean_data <- clean_data %>%
  mutate(
    dominant_frame = apply(
      select(., starts_with("frame_") & ends_with("_count") & !contains("frame_count")), 1,
      function(x) {
        frame_names <- c("THREAT", "OPPORTUNITY", "REGULATION", "DISRUPTION", 
                         "REPLACEMENT", "QUALITY", "EQUITY", "COMPETENCE")
        if (all(x == 0)) return("NONE")
        frame_names[which.max(x)]
      }
    ),
    frame_intensity = rowSums(select(., starts_with("frame_") & ends_with("_count") & !contains("frame_count"))),
    frame_count = rowSums(select(., starts_with("frame_") & ends_with("_present"))),
    sentiment_score = sentiment_POSITIVE_count - sentiment_NEGATIVE_count,
    sentiment_category = case_when(
      sentiment_score > 2 ~ "Positive",
      sentiment_score < -2 ~ "Negative",
      TRUE ~ "Neutral"
    ),
    primary_actor = apply(
      select(., starts_with("actor_") & ends_with("_count")), 1,
      function(x) {
        actor_names <- c("STUDENTS", "TEACHERS", "ADMINISTRATORS", "INSTITUTIONS",
                         "TECH_COMPANIES", "EXPERTS", "POLICY_MAKERS")
        if (all(x == 0)) return("NONE")
        actor_names[which.max(x)]
      }
    ),
    narrative_phase = case_when(
      DATE < as.Date("2023-06-01") ~ "Phase 1: Emergence",
      DATE < as.Date("2024-01-01") ~ "Phase 2: Debate",
      DATE < as.Date("2024-09-01") ~ "Phase 3: Integration",
      TRUE ~ "Phase 4: Normalization"
    )
  )

cat("Frame analysis complete.\n")
Frame analysis complete.
Show code
cat("Articles with at least one frame:", sum(clean_data$frame_count > 0), "\n")
Articles with at least one frame: 3154 
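As a quick sanity check, the matching rule used by `detect_frames()` can be run on a toy headline. This is a hypothetical illustration: `mini_dicts` and `headline` are invented here, and the dictionaries are trimmed to a few entries from the full lists above.

```r
library(stringr)

# Trimmed dictionaries (subset of the full frame dictionaries above)
mini_dicts <- list(
  THREAT      = c("prijetnja", "plagijat", "zabrana"),
  OPPORTUNITY = c("prilika", "alat", "potencijal")
)

headline <- "ChatGPT: prilika za škole ili prijetnja? Plagijati rastu"

# Same matching rule as detect_frames(): lowercase text, open-ended "\\b"
# prefix match, so "plagijat" also catches the plural "plagijati"
sapply(names(mini_dicts), function(f) {
  pattern <- paste0("\\b(", paste(mini_dicts[[f]], collapse = "|"), ")")
  sum(str_count(str_to_lower(headline), pattern))
})
#>      THREAT OPPORTUNITY 
#>           2           1
```

The headline scores two THREAT hits ("prijetnja", "plagijati") and one OPPORTUNITY hit ("prilika"), so it would be classified as THREAT-dominant.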

3 Results

3.1 Coverage Overview

3.1.1 Dataset Summary

Show code
summary_stats <- tibble(
  Metric = c(
    "Total Articles",
    "Date Range",
    "Unique Sources",
    "Total Words Analyzed",
    "Mean Article Length (words)",
    "Articles with Frame Detected"
  ),
  Value = c(
    format(nrow(clean_data), big.mark = ","),
    paste(min(clean_data$DATE), "to", max(clean_data$DATE)),
    format(n_distinct(clean_data$FROM), big.mark = ","),
    format(sum(clean_data$word_count, na.rm = TRUE), big.mark = ","),
    format(round(mean(clean_data$word_count, na.rm = TRUE)), big.mark = ","),
    paste0(format(sum(clean_data$frame_count > 0), big.mark = ","), 
           " (", round(mean(clean_data$frame_count > 0) * 100, 1), "%)")
  )
)

kable(summary_stats, align = c("l", "r")) %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)
Table 1: Dataset Overview
Metric Value
Total Articles 3,873
Date Range 2021-01-01 to 2024-07-04
Unique Sources 512
Total Words Analyzed 1,483,958
Mean Article Length (words) 383
Articles with Frame Detected 3,154 (81.4%)

3.1.2 Temporal Distribution

Show code
monthly_stats <- clean_data %>%
  group_by(year_month) %>%
  summarise(
    n_articles = n(),
    prop_THREAT = mean(frame_THREAT_present, na.rm = TRUE),
    prop_OPPORTUNITY = mean(frame_OPPORTUNITY_present, na.rm = TRUE),
    prop_REGULATION = mean(frame_REGULATION_present, na.rm = TRUE),
    mean_sentiment = mean(sentiment_score, na.rm = TRUE),
    .groups = "drop"
  )

ggplot(monthly_stats, aes(x = year_month, y = n_articles)) +
  geom_col(fill = "#2c7bb6", alpha = 0.8) +
  geom_smooth(method = "loess", se = TRUE, color = "#d7191c", linewidth = 1.2) +
  scale_x_date(date_breaks = "3 months", date_labels = "%b\n%Y") +
  labs(
    title = "Media Coverage of AI in Croatian Education",
    subtitle = "Monthly article count with trend line",
    x = NULL, 
    y = "Number of Articles"
  )
Figure 1: Monthly Coverage Volume

3.1.3 Day of Week Patterns

Show code
dow_stats <- clean_data %>%
  filter(!is.na(day_of_week)) %>%
  count(day_of_week) %>%
  mutate(percentage = n / sum(n) * 100)

ggplot(dow_stats, aes(x = day_of_week, y = n)) +
  geom_col(fill = "#2c7bb6", alpha = 0.8) +
  geom_text(aes(label = paste0(round(percentage, 1), "%")), vjust = -0.5, size = 3.5) +
  labs(
    title = "Publication Day Patterns",
    x = NULL, 
    y = "Number of Articles"
  ) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
Figure 2: Publication Patterns by Day of Week

3.2 Frame Analysis

3.2.1 Dominant Frames

Show code
frame_dist <- clean_data %>%
  count(dominant_frame, sort = TRUE) %>%
  mutate(
    percentage = n / sum(n) * 100,
    dominant_frame = factor(dominant_frame, levels = dominant_frame)
  )

ggplot(frame_dist, aes(x = reorder(dominant_frame, n), y = n, fill = dominant_frame)) +
  geom_col() +
  geom_text(aes(label = paste0(round(percentage, 1), "%")), hjust = -0.1, size = 3.5) +
  scale_fill_manual(values = frame_colors) +
  coord_flip() +
  labs(
    title = "Distribution of Dominant Frames",
    subtitle = "Based on highest frame word count per article",
    x = NULL, 
    y = "Number of Articles"
  ) +
  theme(legend.position = "none") +
  expand_limits(y = max(frame_dist$n) * 1.15)
Figure 3: Distribution of Dominant Frames

3.2.2 Frame Evolution Over Time

Show code
frame_evolution <- monthly_stats %>%
  select(year_month, prop_THREAT, prop_OPPORTUNITY, prop_REGULATION) %>%
  pivot_longer(-year_month, names_to = "frame", values_to = "proportion") %>%
  mutate(frame = str_remove(frame, "prop_"))

ggplot(frame_evolution, aes(x = year_month, y = proportion, color = frame)) +
  geom_line(linewidth = 1.2) +
  geom_point(size = 2) +
  scale_color_manual(values = frame_colors) +
  scale_y_continuous(labels = scales::percent) +
  scale_x_date(date_breaks = "3 months", date_labels = "%b\n%Y") +
  labs(
    title = "Frame Prevalence Over Time",
    subtitle = "Proportion of articles containing each frame",
    x = NULL, 
    y = "Proportion of Articles",
    color = "Frame"
  )
Figure 4: Evolution of Media Frames Over Time

3.2.3 Frame Co-occurrence

Show code
frame_cols <- clean_data %>%
  select(starts_with("frame_") & ends_with("_present"))

if (ncol(frame_cols) > 1) {
  frame_cooccur <- crossprod(as.matrix(frame_cols))
  diag_vals <- diag(frame_cooccur)
  diag_vals[diag_vals == 0] <- 1
  frame_cooccur_norm <- frame_cooccur / diag_vals
  
  frame_cooccur_df <- as.data.frame(frame_cooccur_norm)
  frame_cooccur_df$frame1 <- rownames(frame_cooccur_df)
  frame_cooccur_df <- frame_cooccur_df %>%
    pivot_longer(-frame1, names_to = "frame2", values_to = "cooccurrence") %>%
    mutate(
      frame1 = str_extract(frame1, "(?<=frame_)[A-Z]+"),
      frame2 = str_extract(frame2, "(?<=frame_)[A-Z]+")
    ) %>%
    filter(!is.na(frame1) & !is.na(frame2))
  
  ggplot(frame_cooccur_df, aes(x = frame1, y = frame2, fill = cooccurrence)) +
    geom_tile(color = "white") +
    geom_text(aes(label = round(cooccurrence, 2)), size = 3) +
    scale_fill_viridis_c(option = "magma") +
    labs(
      title = "Frame Co-occurrence Matrix",
      subtitle = "Normalized by diagonal (self-occurrence)",
      x = NULL, y = NULL, fill = "Co-occurrence"
    ) +
    theme(axis.text.x = element_text(angle = 45, hjust = 1))
}
Figure 5: Frame Co-occurrence Matrix
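A subtlety worth noting: the division `frame_cooccur / diag_vals` relies on R recycling the diagonal vector down the columns, so entry [i, j] is divided by the diagonal of row i. Each row of the matrix therefore reads as the share of frame-i articles that also contain frame j. A minimal sketch on an invented presence matrix:

```r
# Toy presence matrix: 4 articles x 2 frames (TRUE = frame present)
m <- cbind(
  THREAT      = c(TRUE, TRUE, TRUE, FALSE),
  OPPORTUNITY = c(TRUE, FALSE, FALSE, TRUE)
)
co <- crossprod(m)  # co[i, j] = number of articles containing both frames

# Recycling divides entry [i, j] by diag[i], giving row-conditional shares:
# of the 3 THREAT articles, 1/3 also carry OPPORTUNITY;
# of the 2 OPPORTUNITY articles, 1/2 also carry THREAT
round(co / diag(co), 2)
#>             THREAT OPPORTUNITY
#> THREAT        1.00        0.33
#> OPPORTUNITY   0.50        1.00
```

This explains why the heatmap in Figure 5 is asymmetric even though the raw co-occurrence counts are symmetric.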

3.3 Narrative Phases

Show code
phase_stats <- clean_data %>%
  group_by(narrative_phase) %>%
  summarise(
    n = n(),
    threat = mean(frame_THREAT_present, na.rm = TRUE) * 100,
    opportunity = mean(frame_OPPORTUNITY_present, na.rm = TRUE) * 100,
    regulation = mean(frame_REGULATION_present, na.rm = TRUE) * 100,
    .groups = "drop"
  ) %>%
  mutate(narrative_phase = factor(narrative_phase, levels = c(
    "Phase 1: Emergence", "Phase 2: Debate", "Phase 3: Integration", "Phase 4: Normalization"
  )))

phase_long <- phase_stats %>%
  select(narrative_phase, threat, opportunity, regulation) %>%
  pivot_longer(-narrative_phase, names_to = "frame", values_to = "percentage") %>%
  mutate(frame = str_to_title(frame))

ggplot(phase_long, aes(x = narrative_phase, y = percentage, fill = frame)) +
  geom_col(position = "dodge") +
  scale_fill_manual(values = c("Threat" = "#e41a1c", "Opportunity" = "#4daf4a", "Regulation" = "#377eb8")) +
  labs(
    title = "Frame Distribution by Narrative Phase",
    subtitle = "How dominant frames shift across coverage periods",
    x = NULL, 
    y = "Percentage of Articles",
    fill = "Frame"
  ) +
  theme(axis.text.x = element_text(angle = 15, hjust = 1))
Figure 6: Frame Distribution by Narrative Phase
Show code
phase_table <- clean_data %>%
  group_by(narrative_phase) %>%
  summarise(
    `Articles` = n(),
    `Date Range` = paste(min(DATE), "—", max(DATE)),
    `Mean Sentiment` = round(mean(sentiment_score, na.rm = TRUE), 2),
    `% Threat Frame` = paste0(round(mean(frame_THREAT_present, na.rm = TRUE) * 100, 1), "%"),
    `% Opportunity Frame` = paste0(round(mean(frame_OPPORTUNITY_present, na.rm = TRUE) * 100, 1), "%"),
    .groups = "drop"
  )

kable(phase_table) %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)
Table 2: Summary Statistics by Narrative Phase
narrative_phase Articles Date Range Mean Sentiment % Threat Frame % Opportunity Frame
Phase 1: Emergence 2659 2021-01-01 — 2023-05-31 0.07 37.4% 62.2%
Phase 2: Debate 1050 2023-06-01 — 2023-12-30 0.31 36.7% 70.4%
Phase 3: Integration 164 2024-01-02 — 2024-07-04 0.28 30.5% 73.2%

3.4 Sentiment Analysis

Show code
ggplot(monthly_stats, aes(x = year_month)) +
  geom_ribbon(aes(ymin = 0, ymax = pmax(mean_sentiment, 0)), fill = "#4daf4a", alpha = 0.5) +
  geom_ribbon(aes(ymin = pmin(mean_sentiment, 0), ymax = 0), fill = "#e41a1c", alpha = 0.5) +
  geom_line(aes(y = mean_sentiment), linewidth = 1.2, color = "black") +
  geom_hline(yintercept = 0, linetype = "dashed") +
  scale_x_date(date_breaks = "3 months", date_labels = "%b\n%Y") +
  labs(
    title = "Sentiment Trajectory Over Time",
    subtitle = "Mean sentiment score (positive − negative word counts)",
    x = NULL, 
    y = "Mean Sentiment Score"
  )
Figure 7: Sentiment Trajectory Over Time
Show code
sentiment_dist <- clean_data %>%
  count(sentiment_category) %>%
  mutate(percentage = n / sum(n) * 100)

ggplot(sentiment_dist, aes(x = sentiment_category, y = n, fill = sentiment_category)) +
  geom_col() +
  geom_text(aes(label = paste0(round(percentage, 1), "%")), vjust = -0.3, size = 4) +
  scale_fill_manual(values = sentiment_colors) +
  labs(
    title = "Sentiment Distribution",
    x = NULL, 
    y = "Number of Articles"
  ) +
  theme(legend.position = "none") +
  expand_limits(y = max(sentiment_dist$n) * 1.1)
Figure 8: Distribution of Sentiment Categories

3.5 Actor Representation

Show code
actor_frequency <- clean_data %>%
  summarise(
    across(starts_with("actor_") & ends_with("_count"), sum),
    across(starts_with("actor_") & ends_with("_present"), sum)
  ) %>%
  pivot_longer(everything(), names_to = "metric", values_to = "value") %>%
  mutate(
    type = ifelse(str_detect(metric, "_count$"), "Total Mentions", "Articles Present"),
    actor = str_extract(metric, "(?<=actor_)[A-Z_]+") %>%
      str_replace_all("_", " ") %>%
      str_to_title()
  ) %>%
  filter(type == "Total Mentions") %>%
  arrange(desc(value))

ggplot(actor_frequency, aes(x = reorder(actor, value), y = value)) +
  geom_col(fill = "#2c7bb6", alpha = 0.8) +
  coord_flip() +
  labs(
    title = "Actor Representation in Coverage",
    subtitle = "Total mentions across all articles",
    x = NULL, 
    y = "Total Mentions"
  )
Figure 9: Actor Representation in Coverage
Show code
actor_frame_assoc <- clean_data %>%
  filter(primary_actor != "NONE") %>%
  group_by(primary_actor) %>%
  summarise(
    n = n(),
    threat = mean(frame_THREAT_present, na.rm = TRUE) * 100,
    opportunity = mean(frame_OPPORTUNITY_present, na.rm = TRUE) * 100,
    regulation = mean(frame_REGULATION_present, na.rm = TRUE) * 100,
    .groups = "drop"
  ) %>%
  arrange(desc(n))

actor_frame_long <- actor_frame_assoc %>%
  select(primary_actor, threat, opportunity, regulation) %>%
  pivot_longer(-primary_actor, names_to = "frame", values_to = "percentage") %>%
  mutate(frame = str_to_title(frame))

ggplot(actor_frame_long, aes(x = reorder(primary_actor, percentage), y = percentage, fill = frame)) +
  geom_col(position = "dodge") +
  scale_fill_manual(values = c("Threat" = "#e41a1c", "Opportunity" = "#4daf4a", "Regulation" = "#377eb8")) +
  coord_flip() +
  labs(
    title = "Frame Prevalence by Primary Actor",
    subtitle = "Which frames appear when each actor is prominent",
    x = NULL, 
    y = "Percentage of Articles",
    fill = "Frame"
  )
Figure 10: Actor-Frame Associations

3.6 Source Analysis

Show code
# Classify outlets
outlet_classification <- tribble(
  ~pattern,                    ~outlet_type,
  "24sata",                    "Tabloid",
  "index",                     "Tabloid",
  "jutarnji",                  "Quality",
  "vecernji",                  "Quality",
  "slobodna.*dalmacija",       "Regional",
  "novi.*list",                "Regional",
  "dnevnik",                   "Quality",
  "hrt",                       "Public",
  "n1",                        "Quality",
  "net\\.hr",                  "Tabloid",
  "tportal",                   "Quality",
  "bug",                       "Tech",
  "skolski.*portal",           "Education",
  "srednja",                   "Education",
  "poslovni",                  "Business",
  "lider",                     "Business"
)

clean_data$outlet_type <- "Other"
for (i in seq_len(nrow(outlet_classification))) {
  matches <- str_detect(str_to_lower(clean_data$FROM), outlet_classification$pattern[i])
  clean_data$outlet_type[matches] <- outlet_classification$outlet_type[i]
}
Show code
outlet_type_stats <- clean_data %>%
  group_by(outlet_type) %>%
  summarise(
    n_articles = n(),
    pct_threat = mean(frame_THREAT_present, na.rm = TRUE) * 100,
    pct_opportunity = mean(frame_OPPORTUNITY_present, na.rm = TRUE) * 100,
    pct_regulation = mean(frame_REGULATION_present, na.rm = TRUE) * 100,
    mean_sentiment = mean(sentiment_score, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  arrange(desc(n_articles))

outlet_type_long <- outlet_type_stats %>%
  select(outlet_type, pct_threat, pct_opportunity, pct_regulation) %>%
  pivot_longer(-outlet_type, names_to = "frame", values_to = "percentage") %>%
  mutate(frame = str_remove(frame, "pct_") %>% str_to_title())

ggplot(outlet_type_long, aes(x = reorder(outlet_type, percentage), y = percentage, fill = frame)) +
  geom_col(position = "dodge") +
  scale_fill_manual(values = c("Threat" = "#e41a1c", "Opportunity" = "#4daf4a", "Regulation" = "#377eb8")) +
  coord_flip() +
  labs(
    title = "Frame Usage by Outlet Type",
    subtitle = "How different media types frame AI in education",
    x = NULL, 
    y = "Percentage of Articles",
    fill = "Frame"
  )
Figure 11: Coverage by Outlet Type
Show code
outlet_summary <- outlet_type_stats %>%
  mutate(
    `Mean Sentiment` = round(mean_sentiment, 2),
    `% Threat` = paste0(round(pct_threat, 1), "%"),
    `% Opportunity` = paste0(round(pct_opportunity, 1), "%"),
    `% Regulation` = paste0(round(pct_regulation, 1), "%")
  ) %>%
  select(
    `Outlet Type` = outlet_type,
    Articles = n_articles,
    `Mean Sentiment`,
    `% Threat`,
    `% Opportunity`,
    `% Regulation`
  )

kable(outlet_summary) %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)
Table 3: Summary Statistics by Outlet Type
Outlet Type Articles Mean Sentiment % Threat % Opportunity % Regulation
Other 2879 0.17 35% 61.8% 22.9%
Quality 365 -0.12 46% 78.9% 31.8%
Regional 192 0.26 44.8% 58.9% 23.4%
Tabloid 172 0.23 31.4% 62.2% 23.8%
Education 91 0.22 46.2% 91.2% 37.4%
Business 76 0.12 57.9% 90.8% 39.5%
Tech 50 0.32 26% 88% 30%
Public 48 -0.12 31.2% 62.5% 18.8%

3.7 Statistical Tests

3.7.1 Frame-Outlet Association

Show code
frame_outlet_table <- table(clean_data$dominant_frame, clean_data$outlet_type)
chisq_result <- chisq.test(frame_outlet_table)

cat("Chi-Square Test: Dominant Frame vs. Outlet Type\n")
Chi-Square Test: Dominant Frame vs. Outlet Type
Show code
cat("X² =", round(chisq_result$statistic, 2), "\n")
X² = 210.9 
Show code
cat("df =", chisq_result$parameter, "\n")
df = 56 
Show code
cat("p-value =", format(chisq_result$p.value, scientific = TRUE), "\n")
p-value = 8.234689e-20 
Show code
if (chisq_result$p.value < 0.05) {
  cat("\nResult: Significant association between outlet type and frame usage (p < 0.05)\n")
}

Result: Significant association between outlet type and frame usage (p < 0.05)
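A significant chi-square statistic does not show which outlet-frame cells drive the association; Pearson residuals from the test object do. The sketch below uses a self-contained toy table with hypothetical counts (not the study data) purely to illustrate how such residuals are read.

```r
# Hypothetical 2x2 table of dominant frame by outlet type
tab <- matrix(c(30, 10,
                15, 25),
              nrow = 2, byrow = TRUE,
              dimnames = list(frame  = c("THREAT", "OPPORTUNITY"),
                              outlet = c("Tabloid", "Quality")))
ct <- chisq.test(tab)

# Pearson residuals: (observed - expected) / sqrt(expected);
# cells with |residual| > 2 contribute heavily to the statistic
round(ct$residuals, 2)
```

Applied to `chisq_result` above, the same `$residuals` component would reveal which outlet types over- or under-use each dominant frame relative to independence.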

3.7.2 Sentiment by Phase

Show code
anova_result <- aov(sentiment_score ~ narrative_phase, data = clean_data)
anova_summary <- summary(anova_result)

cat("ANOVA: Sentiment Score by Narrative Phase\n")
ANOVA: Sentiment Score by Narrative Phase
Show code
print(anova_summary)
                  Df Sum Sq Mean Sq F value    Pr(>F)    
narrative_phase    2     47  23.256   10.27 0.0000357 ***
Residuals       3870   8767   2.265                      
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Show code
if (anova_summary[[1]]$`Pr(>F)`[1] < 0.05) {
  cat("\nPost-hoc Tukey HSD:\n")
  tukey_result <- TukeyHSD(anova_result)
  print(tukey_result)
}

Post-hoc Tukey HSD:
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = sentiment_score ~ narrative_phase, data = clean_data)

$narrative_phase
                                               diff         lwr       upr
Phase 2: Debate-Phase 1: Emergence       0.24017336  0.11155332 0.3687934
Phase 3: Integration-Phase 1: Emergence  0.20828021 -0.07564788 0.4922083
Phase 3: Integration-Phase 2: Debate    -0.03189315 -0.32818983 0.2644035
                                            p adj
Phase 2: Debate-Phase 1: Emergence      0.0000366
Phase 3: Integration-Phase 1: Emergence 0.1978160
Phase 3: Integration-Phase 2: Debate    0.9654994

4 Discussion

4.1 Key Findings

4.1.1 Coverage Patterns

The analysis reveals substantial media attention to AI in education, with identifiable peaks corresponding to key events such as ChatGPT’s release and the beginning of school semesters.

4.1.2 Frame Dominance

Contrary to initial expectations of moral panic, the OPPORTUNITY and REGULATION frames predominate over the THREAT frame across most of the study period. This suggests Croatian media took a relatively pragmatic approach to the topic.

4.1.3 Narrative Evolution

The data are broadly consistent with the hypothesized narrative arc, with one caveat: the cleaned corpus ends on 2024-07-04, so Phase 4 contains no articles and remains a projection rather than an observed stage:

  • Phase 1 (Emergence): Higher threat framing, focus on plagiarism concerns
  • Phase 2 (Debate): Balanced discussion of risks and benefits
  • Phase 3 (Integration): Shift toward practical implementation
  • Phase 4 (Normalization, projected): AI treated as routine educational tool

4.1.4 Source Variation

Significant differences exist between outlet types (Table 3):

  • Education and business outlets: Densest framing overall, pairing near-universal opportunity framing (above 90%) with above-average threat and regulation framing
  • Quality press: Policy-focused, with elevated regulation framing and among the most negative mean sentiment
  • Tabloids: Contrary to expectations of sensationalism, below-average threat framing and sentiment close to the corpus mean

4.2 Limitations

  1. Dictionary-based analysis: May miss nuanced or novel framings
  2. Croatian language specificity: Dictionaries may not capture all relevant terms
  3. Web sources only: Excludes print, TV, and social media
  4. Automated sentiment: Simplified positive/negative classification
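The first limitation can be made concrete: keyword counting is blind to negation, so a sentence denying a threat still registers a THREAT hit. A hypothetical illustration using the same matching rule as the frame detector:

```r
library(stringr)

threat_terms <- c("opasnost", "opasno", "prijetnja")
pattern <- paste0("\\b(", paste(threat_terms, collapse = "|"), ")")

# "AI is not a danger to education" -- the negation "nije" is invisible
# to the dictionary, so the sentence still scores one THREAT hit
str_count(str_to_lower("AI nije opasnost za obrazovanje"), pattern)
#> [1] 1
```

Manual validation on a sample of articles (Future Directions, item 1) is the standard way to estimate how often such false positives occur.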

4.3 Future Directions

  1. Manual validation of frame classifications
  2. Extension to social media discourse
  3. Comparative analysis with other countries
  4. Longitudinal tracking as AI tools evolve

5 Conclusion

This analysis demonstrates that Croatian media coverage of AI in education has followed a discernible narrative arc from initial concern to pragmatic integration. While threat frames exist, they are outweighed by opportunity and regulatory framings. The findings suggest media discourse may be more nuanced than moral panic theory would predict, with significant variation across outlet types and over time.


6 Appendix: Technical Details

6.1 Session Information

Show code
sessionInfo()
R version 4.2.2 (2022-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.utf8 
[2] LC_CTYPE=English_United States.utf8   
[3] LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.utf8    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] progress_1.2.3            openxlsx_4.2.8           
 [3] broom_1.0.7               changepoint_2.2.4        
 [5] zoo_1.8-12                tidygraph_1.3.1          
 [7] ggraph_2.2.1              igraph_2.1.4             
 [9] kableExtra_1.4.0          knitr_1.42               
[11] viridis_0.6.5             viridisLite_0.4.2        
[13] RColorBrewer_1.1-3        ggrepel_0.9.6            
[15] patchwork_1.3.0           scales_1.3.0             
[17] ggthemes_5.1.0            ggplot2_3.5.1            
[19] quanteda.textplots_0.94.4 quanteda.textstats_0.97  
[21] quanteda_4.3.1            tidytext_0.4.2           
[23] tibble_3.2.1              forcats_1.0.0            
[25] lubridate_1.9.4           stringr_1.5.1            
[27] tidyr_1.3.1               dplyr_1.1.4              

loaded via a namespace (and not attached):
 [1] splines_4.2.2      jsonlite_1.8.9     highr_0.11         yaml_2.3.8        
 [5] pillar_1.10.1      backports_1.5.0    lattice_0.20-45    glue_1.8.0        
 [9] digest_0.6.33      polyclip_1.10-7    colorspace_2.1-1   htmltools_0.5.7   
[13] Matrix_1.5-1       pkgconfig_2.0.3    purrr_1.0.4        svglite_2.1.3     
[17] tweenr_2.0.3       nsyllable_1.0.1    ggforce_0.4.2      timechange_0.3.0  
[21] mgcv_1.8-41        generics_0.1.3     farver_2.1.2       cachem_1.1.0      
[25] withr_3.0.2        cli_3.6.2          magrittr_2.0.3     crayon_1.5.3      
[29] memoise_2.0.1      evaluate_1.0.3     stopwords_2.3      tokenizers_0.3.0  
[33] janeaustenr_1.0.0  nlme_3.1-160       SnowballC_0.7.1    MASS_7.3-58.1     
[37] xml2_1.3.6         prettyunits_1.2.0  tools_4.2.2        hms_1.1.3         
[41] lifecycle_1.0.4    munsell_0.5.1      zip_2.3.2          compiler_4.2.2    
[45] systemfonts_1.2.1  rlang_1.1.2        grid_4.2.2         rstudioapi_0.17.1 
[49] htmlwidgets_1.6.4  labeling_0.4.3     rmarkdown_2.20     gtable_0.3.6      
[53] graphlayouts_1.2.2 R6_2.5.1           gridExtra_2.3      fastmap_1.2.0     
[57] fastmatch_1.1-6    stringi_1.8.4      Rcpp_1.0.14        vctrs_0.6.5       
[61] tidyselect_1.2.1   xfun_0.37         

6.2 Data Export

Show code
# Export processed data for further analysis
write.xlsx(clean_data, "processed_data.xlsx")
write.xlsx(monthly_stats, "monthly_statistics.xlsx")

References